ABSTRACT

In a variation of the traditional research report,
the background and design of a study of second language acquisition
are described, but no substantive results are reported. The intent of
the report is to stimulate thinking about theory in language testing.
Two questions are investigated: What is it that children learn in
second language classes? And how do those things influence each
other as abilities develop? The study is placed in the context of
intensive programs in English as a Second Language (ESL) within the
francophone school system of Quebec, Canada, and of two contrasting
views of trait development based on acquisition of grammatical
competence or communicative competence as the trait driving further
learning. Seven test pairs were developed, each using a different
measurement method, with one test in each pair measuring grammatical
and one measuring communicative competence. These prototype measures
were validated and a variety of alternate trait indicators
identified. A time series study of trait development will be
undertaken using these measures. Questions concerning the
interpretability of the results are currently under consideration,
including number of indicators needed, interchangeability of
measurement methods, and use of an abbreviated or simplified testing
model. A 23-item bibliography, testing models, and test method
examples are appended. (MSE)
FOREWORD

One of the early traditions of the Language Testing Research 
Colloquium (LTRC) was the development of research projects in
testing. Colleagues with new ideas would bring them to the 
colloquium for the advice and criticism of their colleagues. New 
ideas became research plans; plans became projects; the quality 
of these projects was better for having been considered and 
criticised. 

This early tradition has been a victim of the success of the 
LTRC. What began as a get-together of fewer than two dozen
colleagues has become a respected conference. With audiences in 
excess of a hundred people and a schedule that grants forty 
minutes or less to each presenter, there is little opportunity 
for considered, critical discussion. Under these constraints the 
LTRC has become primarily a forum for the presentation of
research reports.

Measurement of language abilities has an important place in
applied linguistic research. It is, first of all, the link
between theory and phenomena of interest in our field (see Zeller &
Carmines, 1980). Without this link, theories of language use
or learning may be elegant and internally consistent, but they
remain essentially meaningless. At the same time, measurement does
more than instantiate theory. Measurement
may also inform theory (see, e.g., Andrich, 1988, for a review of
the notion of fundamental measurement).

In this "paper" we are concerned to recall the older
tradition of the LTRC in service of our desire to give
empirically grounded meaning to, and to further the development
of, theory concerning language teaching. The aim sounds
altogether a bit too grand. In fact it is much more mundane. We 
are seeking guidance as we begin to investigate two questions: 
What exactly is it that kids learn in second language classes? 
And how do those things influence one another as abilities 
develop? 

This paper will take the form of a research report. Only 
the substance will be different. The background and design of 
the study will be described, but there will be no substantive 
results. Instead we will ask questions of the audience. We 
trust that your answers will aid our work just as the advice of 
colleagues has helped the work of past participants in this 
colloquium.



INTRODUCTION
Background: Theory and Practice

The predominant conceptualization of communicative 
competence (CC) today is componential (e.g., Bachman, 1990;
Canale, 1983; Canale & Swain, 1980; Hymes, 1972; Munby, 1978;
Savignon, 1983). The various descriptions of CC are consistent 
in their recognition that communicative language ability involves 
both "knowledge or competence and the capability for implementing 
that competence in language use" (Bachman, 1990, p. 108). 

Within the current descriptions of CC, one of several 
components is grammatical or organizational competence (GC).
Second language (L2) teaching practices offer contrasting views
of the process of language development concerning GC and CC
within the componential conceptualization.

One view professes that an increase in GC results in an 
augmentation of CC. This view is implicit in courses that focus 
on language usage. It is influenced by structural linguistics 
and is in line with the traditional approach to second language 
teaching. As a rule, classroom practices emphasize the study and
analysis of language form, stress the mastery of discrete
elements, and tend to be teacher-dominated with the purpose of
guiding students toward grammatical accuracy (Brumfit, 1984).

The contrasting view proposes that an increase in CC through 
the application of communicative strategies provides the 
requisite precondition for development of GC. This view is
implicit in courses that emphasize language use (i.e., a
communicative curriculum). In general, classroom practices focus
on activities in which students are actively interacting with and 
using language to construct meaning for themselves and others. 
Stress is placed on the development of skills and strategies to 
help guide students to participate in language experiences. 
Activities tend to be learner-centered and meaning-based to 
encourage language use leading to fluency (Brumfit, 1980). 

In simple terms, one view holds that language development is
GC driven and the other that language development is CC driven.
Even though the CC driven view is influenced by the emerging
sociocultural focus on the nature of language and the
cognitively-based focus on the nature of language learning, the
theories behind it are less well articulated than are those
underlying the GC driven view (Dubin & Olshtain, 1986). The lack
of articulation is in turn reflected in the presence of a variety
of language teaching practices which apparently result from
different interpretations of the CC driven point of view. There
appears to be some uncertainty as to how, or even whether, instruction in
grammar should be provided.

One answer to these questions is provided by Dickins and
Woods (1988). They point out "that the rise of the
notional/functional/communicative curriculum has sometimes been
accompanied by a devaluation of grammar as one of the organizing
principles in commercially available language-learning materials"
(p. 623). They take the view that "grammar does not function as
an end in itself but, rather, as a means toward successful
communication," and therefore should not be ignored in
instruction (p. 636).

A somewhat different interpretation is evidenced by
Krashen's claim "that by 'going for meaning' the learner will
automatically acquire structure, and that language development is
a matter of moving from meaning to form rather than the other way
round" (Nunan, 1985, p. 29). Another example can be found in
Ellis (1990), who discusses what he labels the "cognitive
anti-method." He states that even though this method had little
impact on classroom practices, its underlying assumptions have
been "incorporated into subsequent theories of classroom language
learning derived from L2 acquisition research" (p. 35). One
assumption in particular that has influenced various L2 teaching
practices today is that linguistic analysis is not necessary
(i.e., it is not necessary to attend to linguistic form in order
to acquire an L2).

Such ideas tend to support naturalistic L2 acquisition.
With continual experience and exposure to the L2, a learner will 
gradually internalize the linguistic code and be able to 
communicate competently. According to Johnston (198**, as cited 
in Nunan, 1985) the learning that occurs will always be 
contingent on the learner's stage of development.

In summary, teaching practices show variation across the two
views concerning the development of language abilities. In
addition, there are differences within the CC driven view.
Cummins and Swain (1986) suggest that much more research is
needed in order to fill in the specifics of the theories that
underlie L2 teaching practices within the CC driven view.
Investigation of Two Contrasting Views

From L2 teaching practice, if not from theory, two distinct 
notions about general language development can be identified. In 
the first, CC depends upon and lags behind GC. In the 
contrasting view, GC depends upon and lags behind CC. An 
investigation into the relative adequacy of these two accounts 
was seen as a logical and promising next step. Which of the two
views, the GC driven view or the CC driven view, provides the
better account of what happens in L2
development? Examining the trait development of grammatical
knowledge and communicative ability could provide information to
educators, curriculum developers, and researchers alike.

One recent and promising statistical technique for the study 
of theoretical models is covariance structure analysis. It can 
be implemented as causal modeling or confirmatory factor
analysis. The use of confirmatory methods has only just begun in
the language sciences (e.g., Bachman & Palmer, 1983; Harley,
Allen, Cummins, & Swain, 1990; Nelson, Lomax, & Perlman, 1984;
Purcell, 1983; Sang, Schmitz, Vollmer, Baumer, & Roeder, 1986;
Turner, 1989). The focus of this research has been trait
organization in university-level students and adults. With the 
exception of Harley et al. (1990), there are no reported studies 
on the structure of language abilities in children and 
adolescents which use confirmatory methods. In addition, even
though there are longitudinal studies of
language development in children and adolescents (e.g., French
immersion students in Canada and English as a Second Language,
ESL, "submersion" students in the United States), none of this
research implements time series confirmatory methods.
Research Study Context: Intensive ESL Programs in Quebec, Canada
In 1981, L2 instruction in the province of Quebec moved from
a grammar-based curriculum to a communication-based curriculum
(see Ministère de l'Éducation, Gouvernement du Québec, 198<f).
This motivated the development of alternative approaches to the 
teaching of ESL. As a result, what has become known as intensive 
ESL programs came into existence within the French-speaking 
school system. In general these programs are implemented in 
grade five or six. (ESL is a required subject from grade four.) 
Instead of the regular ESL program of 120 minutes per week, 
students are immersed in five hours of ESL instruction per day
for a period of five months of one school year. All other 
required academic subjects are given in French during the other 
months of the year.

The setting of intensive ESL programs was considered an
appropriate context for the investigation of trait development as
discussed above within two current contrasting views of general
L2 development. The actual language teaching practices that take
place in such programs provide the conditions that would allow
for either of the hypothesised processes to operate. Instruction
appears to be both language based and content based. Because it is
language based, the GC driven hypothesis could work; because it is
content based, the CC driven hypothesis could work. According to
Brinton, Snow, and Wesche (1989), such instruction is identified
within what has become known as the "content-based movement."
They define content-based instruction "as the integration of
content learning with language teaching aims" (p. vii). They
claim that L2 structures, functions, and discourse features can
be provided through the use of authentic texts. They go on to
say, however, that even within this "movement" there are two
different views concerning the role of content in authentic texts
and language teaching. One view is that all the features that
are provided, "once identified, can then be taught at least
partially in isolation, with lessons focused on particular
language forms, functions, and patterns" (p. 2). This reflects
the GC driven hypothesis. The second view is that "the emphasis
on the informational content itself provides an effective means
for incidental acquisition of the language features it presents"
(p. 2). This represents the CC driven hypothesis. Brinton et
al. endorse neither view. They claim that more research is
needed to investigate the actual process of language and content
learning. For the moment at least, they support the use of
content-based L2 instruction classrooms in that the teaching of
form is frequently combined with experiential methods. This is
the kind of classroom that is available for our research.

METHOD 

The investigation of trait development includes four related 
stages: 

(1) Development of feasible prototype measures of two
traits,

(2) Validation of prototype measures and selection of
indicators,

(3) Development of alternate forms of indicators,

(4) Time series study of trait development.

The schedule for the different stages is shown in Figure 3. It 
is anticipated that the full study will require three years for 
completion. Part of this reflects sequential requirements, and 
part reflects the constraints imposed by academic scheduling in 
the schools where the research is to be carried out. 
Development of feasible prototype measures of two traits

During the Spring of 1990 six graduate students in applied 
linguistics at Concordia University prepared eight pairs of ESL 
tests. Each pair employed the same measurement method. One test
of each pair was designed to test formal grammatical knowledge
and the other to test communicative ability. The test developers 
reviewed curricular materials and methods. They produced tests 
that were designed to incorporate only linguistic, notional, 
functional and thematic material from the students' program. 
They made certain also that test methods were known from regular 
classroom experience. There were six oral test pairs: four
individually administered speaking test pairs, one group 
administered speaking test pair and one group administered 
listening test pair. There were two group administered test 
pairs requiring reading of English. No tests required students 
to write in English. The group speaking test was not considered 
to be feasible for this study because it required elaborate video 
studio capabilities for administration. The other seven test 
pairs were tried out for feasibility with £B students similar to 
those who will participate in the time series study. These 
subjects were French speaking, Grade 5 students in their fourth 
month of intensive ESL instruction. 

The seven test pairs are briefly described below:
Method 1. Sentence production. Visual cues, 15 sentences 
produced in English. Responses scored for accuracy and 
appropriateness to the cues. 

Method 2. Elicited imitation of fifteen sentences.
Responses scored for grammatical accuracy and for
reproduction of content.

Method 3. Retell the story of a two and a half minute
television presentation. Scored both for formal accuracy of
speech and for fidelity and completeness of retelling.

Method 4. Oral translation. Student gives English
translations for 15 audio-taped sentences in French. Visual 
support is provided. Scored for formal accuracy and 
fidelity of translation. 

Method 5. Multiple-choice translation of sentences from 
French. One answer choice is semantically and grammatically
incorrect; one is semantically and grammatically correct; the
other two choices are correct either semantically or
grammatically, but not both.

Method 6. Sentence judgement task. One set of 15 sentences 
to be judged for grammatical correctness. A second set of 
15 sentences judged for truth. 

Method 7. Multiple-choice written test with picture 
support. 15 items require selection of correct grammatical 
form from among four choices all of which are semantically
congruous with the picture. 15 items require selection of
the semantically congruent option from among four
grammatically correct choices. 

Test methods were considered feasible if several criteria
were satisfied. No more than one student could fail to
understand the task for either the grammatical or communicative
test. The method should yield variance in both trait scores.
There should be no problems encountered in administration.
Testing time should be brief: less than ten minutes for group
tests of 15 items with instructions and less than five minutes
for individually administered tests.
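
The feasibility criteria above can be written as a simple check. The following is a minimal sketch, not the procedure actually used in the tryout. The record type `TryoutResult` and its field names are hypothetical, and it assumes that the limit of one confused student and the time limits apply to each test of a pair separately.

```python
from dataclasses import dataclass
from statistics import pvariance
from typing import Sequence


@dataclass
class TryoutResult:
    """Hypothetical record of one test pair's feasibility tryout."""
    group_administered: bool
    minutes_grammatical: float        # testing time, grammatical test
    minutes_communicative: float      # testing time, communicative test
    misunderstood_grammatical: int    # students who did not understand the task
    misunderstood_communicative: int
    grammatical_scores: Sequence[float]
    communicative_scores: Sequence[float]
    administration_problems: bool


def is_feasible(r: TryoutResult) -> bool:
    """Apply the four feasibility criteria described in the text."""
    # Assumption: limits apply to each test in the pair separately.
    time_limit = 10.0 if r.group_administered else 5.0
    understood = max(r.misunderstood_grammatical,
                     r.misunderstood_communicative) <= 1
    has_variance = (pvariance(r.grammatical_scores) > 0
                    and pvariance(r.communicative_scores) > 0)
    brief = max(r.minutes_grammatical, r.minutes_communicative) < time_limit
    return (understood and has_variance and brief
            and not r.administration_problems)
```

A method fails this check as soon as any one criterion is violated, which matches the treatment of Method 6 in the next paragraph.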

Only test Method 6 proved unfeasible in the way in which it 
was administered. Students were not able to restrain their 
laughter upon hearing such untrue sentences as, "It's nice and 
warm in Montreal in January." Under those circumstances 
compromise was inevitable. It is likely that the method would 
prove satisfactory if used in individual administrations. 

Because of the small number of students who participated in
the feasibility study, no further analysis of test results was
undertaken. Rating scales may have to be refined for some 
measures which will be validated. 

Stage 2, the first step yet to be completed in our research,
is the validation of feasible measures. Because the relation
between traits is a major focus of the study, divergent validity
is crucial. Accordingly, multitrait-multimethod (MTMM) procedures will be
employed in the study. LISREL will be used for data analysis.
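
As a rough illustration of what a multitrait-multimethod matrix shows, the sketch below averages three kinds of correlations among tests: same trait with different methods, different traits with the same method, and different traits with different methods. It is a descriptive summary only, not the confirmatory LISREL analysis planned for the study. The function name `mtmm_summary` is hypothetical, and the data are simulated with made-up effect sizes.

```python
import numpy as np


def mtmm_summary(scores, traits, methods):
    """Average the three kinds of off-diagonal MTMM correlations.

    scores  : (n_students, n_tests) array, one column per test
    traits  : trait label for each column, e.g. "GC" or "CC"
    methods : method label for each column, e.g. 1, 2, 3
    Assumes at least two traits and two methods, so no bucket is empty.
    """
    r = np.corrcoef(np.asarray(scores, dtype=float), rowvar=False)
    buckets = {"same_trait_diff_method": [],   # convergent evidence
               "diff_trait_same_method": [],   # shared method variance
               "diff_trait_diff_method": []}   # baseline
    k = len(traits)
    for i in range(k):
        for j in range(i + 1, k):
            same_t = traits[i] == traits[j]
            same_m = methods[i] == methods[j]
            if same_t and not same_m:
                buckets["same_trait_diff_method"].append(r[i, j])
            elif same_m and not same_t:
                buckets["diff_trait_same_method"].append(r[i, j])
            elif not same_t and not same_m:
                buckets["diff_trait_diff_method"].append(r[i, j])
    return {name: float(np.mean(vals)) for name, vals in buckets.items()}


# Simulated example: two traits, three methods (all values are made up).
rng = np.random.default_rng(0)
n = 500
gc = rng.normal(size=n)
cc = 0.3 * gc + rng.normal(size=n)          # traits modestly correlated
cols, traits, methods = [], [], []
for m in (1, 2, 3):
    method_effect = rng.normal(size=n)      # variance shared within a method
    for name, trait in (("GC", gc), ("CC", cc)):
        cols.append(trait + 0.5 * method_effect
                    + rng.normal(scale=0.7, size=n))
        traits.append(name)
        methods.append(m)
summary = mtmm_summary(np.column_stack(cols), traits, methods)
```

Evidence for divergent validity would be same-trait correlations that are clearly higher than the different-trait correlations, including those that share a method.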

Two questions relating to the validation stage are still 
unanswered. The first concerns tradeoffs between number of 
instruments and length of each instrument when conducting an MTMM 
study with only a limited amount of testing time available. We 
will need multiple indicators of each of the two traits under
investigation, at least three. Do we then use a larger number of
instruments in the validation study to increase the chances that 
we will have looked at possible "winners"? Or do we use the most 
reliable instruments we can (i.e., with greater length) to 
increase the chances that any "winners" among the tests examined 
will indeed be recognized? 
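
One way to make this tradeoff concrete is the Spearman-Brown prophecy formula, which predicts the reliability of a lengthened test on the assumption that the added items are parallel to the existing ones. The sketch below is illustrative only. The 90-item budget and the base reliability of 0.60 for a 15-item form are invented figures, not results from the feasibility study.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability when a test is lengthened by `length_factor`
    (Spearman-Brown prophecy formula; assumes added items are parallel)."""
    return (length_factor * reliability
            / (1.0 + (length_factor - 1.0) * reliability))


# Hypothetical fixed budget of 90 items per trait:
# more, shorter instruments versus fewer, longer ones.
base_reliability = 0.60   # assumed reliability of a 15-item form (made up)
for n_instruments in (6, 3):
    items_each = 90 // n_instruments
    predicted = spearman_brown(base_reliability, items_each / 15)
    print(n_instruments, "instruments of", items_each, "items:",
          round(predicted, 2))
```

The formula says nothing about which methods are worth keeping. It only quantifies what is gained in reliability by spending the available testing time on fewer instruments.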

The second question is related to the first in that it is 
also related to sensitivity. How does one determine a reasonable 
sample size for an MTMM study with two latent variables and any 
given number of indicators? 

Development of alternate forms of indicators

Four forms of each indicator will be required in the study. 
Their production is the third stage of our work. Test methods 
will be those indicated by the results of the validation study. 
Procedures for development will be the same as those followed in 
making the original tests. Length may be increased from that of 
the forms used in the validation stage if their reliabilities 
were unnecessarily low. 
Time series study of trait development

At the beginning of next year we anticipate the start of the 
final stage of the project, a longitudinal study of the 
development of GC and CC in grade 5 students of English as a
second language. Subjects will be tested at four times during an
intensive ESL course that provides 25 hours of instruction per
week. Testing will take place at three-week intervals during the
10th, 13th, 16th and 19th weeks of the course.

Panel data from the tests will be analyzed to determine fit 
to two different cross-lagged time series models. One of the 
models represents the GC driven view of language development; the
other model represents the CC driven view. LISREL will be used
for analysis of the two models. The model with the better fit 
will suggest the better explanation for trait development. 
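
The logic of the cross-lagged comparison can be illustrated with a deliberately simplified sketch. It uses observed scores at two occasions and ordinary least squares, not the latent-variable models that will be estimated with LISREL. The function name `cross_lagged_paths` is hypothetical, and the data are simulated under a made-up GC driven process in which later CC depends on earlier GC but not the reverse.

```python
import numpy as np


def cross_lagged_paths(gc_t1, cc_t1, gc_t2, cc_t2):
    """Estimate the two cross-lagged paths by ordinary least squares.

    Each later score is regressed on both earlier scores. Returns
    (gc_to_cc, cc_to_gc): the effect of earlier GC on later CC and of
    earlier CC on later GC, each controlling for the autoregression.
    """
    X = np.column_stack([np.ones_like(gc_t1), gc_t1, cc_t1])
    b_cc, *_ = np.linalg.lstsq(X, cc_t2, rcond=None)
    b_gc, *_ = np.linalg.lstsq(X, gc_t2, rcond=None)
    return b_cc[1], b_gc[2]


# Simulated GC driven development (all coefficients are made up).
rng = np.random.default_rng(1)
n = 2000
gc1 = rng.normal(size=n)
cc1 = 0.4 * gc1 + rng.normal(size=n)
gc2 = 0.7 * gc1 + rng.normal(scale=0.5, size=n)              # GC depends only on earlier GC
cc2 = 0.5 * cc1 + 0.5 * gc1 + rng.normal(scale=0.5, size=n)  # CC lags behind GC
gc_to_cc, cc_to_gc = cross_lagged_paths(gc1, cc1, gc2, cc2)
```

Under the GC driven view the path from earlier GC to later CC should be substantial and the reverse path near zero; under the CC driven view the pattern should be reversed.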

The two models to be tested are illustrated in
simplified form in Figures 1 and 2. The two traits are
represented by the circles; the numbers within circles indicate
testing times. Grammatical competence is indicated by an F (for
"formal") within a circle; communicative competence is indicated
by a C. The figures are simplified to show only two indicators
for each latent variable; they eliminate error and method
effects, shocks, etc. The purpose of the figures is to emphasize
the contrast in lags that characterizes the two views of general
L2 development. (The models, not just the figures, are also
simplified; they do not incorporate other components of
communicative competence.)

It should be noted that the study will not provide a "proof" 
for one of the competing hypotheses. The two models are not 
congeneric. Results can be taken only as indicative. 

We already have three questions we want to find answers for
before we start work on this stage of the study. Two are related
to measures. The third is related more to data analysis. 

The first question refers to the number of indicators 
needed. The answer will be given in part by the need to 
overidentify the models that are being estimated. There may be
other considerations, however, such as, for example, desirable
degrees of overidentification. The second question, also related
to measures, concerns test methods. What might be lost if one 
does not use the same sets of methods in measuring both of the 
traits? That is, might that in some way (How?) create a bias 
towards a better fit for one of the models? 
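
The link between number of indicators and overidentification can be shown by counting. The sketch below computes degrees of freedom for a single-occasion, correlated-traits factor model only, not for the full cross-lagged models. It assumes that each indicator loads on one trait, that factor variances are fixed to 1, and that errors are uncorrelated. The function name is hypothetical.

```python
def cfa_degrees_of_freedom(indicators_per_trait: int, n_traits: int = 2) -> int:
    """Degrees of freedom for a simple correlated-traits factor model.

    Assumes each indicator loads on exactly one trait, factor variances
    are fixed to 1, all factor covariances are free, and errors are
    uncorrelated. A positive count is a necessary (not sufficient)
    condition for the model to be overidentified.
    """
    p = indicators_per_trait * n_traits            # observed variables
    moments = p * (p + 1) // 2                     # distinct variances and covariances
    loadings = p
    error_variances = p
    factor_covariances = n_traits * (n_traits - 1) // 2
    return moments - (loadings + error_variances + factor_covariances)


for k in (2, 3, 4):
    print(k, "indicators per trait:", cfa_degrees_of_freedom(k), "df")
```

Adding method factors or correlated errors uses up further parameters, so the counts for an MTMM or time series model with the same number of indicators would be smaller.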

The third question is concerned with using an abbreviated or 
simplified model. Are we risking an artifactual bias towards one
of the models by failing to include other components of 
communicative competence or of higher order factors - either 
constant factors or stochastic processes? 

These questions are indicative of our concern that results 
may be interpretable in the way that we would wish to interpret 
them. There may be other, more important questions that we have 
failed to ask. If so, we would hope to learn what they are, and 
also, if possible, what their answers are. 
FIGURE 3. Schedule for investigation.
(Text heard by subjects in bold)





Don't you like chocolate cake?
Don't you like chocolate cake?

Mon père n'aime pas la crème glacée.
Mon père n'aime pas la crème glacée.



Nathalie danse avec son professeur.

(A) Nathalie's dancing with her teacher. 

(B) Nathalie's reading with her teacher. 

(C) Nathalie's dancing with his teacher. 
(D) Nathalie's reading with his teacher.



Method 6. 



John are a student. 



It's usually very cold in July.
She's a

a) nurse 

b) tourist 

c) cook 